Songbirds have become impressive neurobiological models for aspects of human verbal communication
because they learn to sequence their song elements, analogous, in some ways, to how humans 
learn to produce spoken sequences with syntactic structure. However, mammals such as non- 
human primates are considered to be at best limited-vocal learners and not able to sequence 
their vocalizations, although some of these animals can learn certain 'artificial grammar' sequences. 
Thus, conceptual issues have slowed the progress in exploring potential neurobiological homologues 
to language-related processes in species that are taxonomically closely related to humans. We con- 
sider some of the conceptual issues impeding a pursuit of, as we define them, 'proto-syntactic' 
capabilities and their neuronal substrates in non-human animals. We also discuss ways to better 
bridge comparative behavioural and neurobiological data between humans and other animals. 
Finally, we propose guiding neurobiological hypotheses with which we aim to facilitate the future 
testing of the level of correspondence between the human brain network for syntactic-learning 
and related neurobiological networks present in other primates. Insights from the study of non- 
human primates and other mammals are likely to complement those being obtained in birds to 
further our knowledge of the human language-related network at the cellular level. 
1. INTRODUCTION 

If you can find a path with no obstacles, it probably 
doesn't lead anywhere. 

Frank A. 'Parson' Clark, ca. 1963 

The path towards understanding the behavioural abil- 
ities and neuronal substrates that are evolutionarily 
related to those that humans use for language has 
been as challenging as it has been informative. 
Recently, we have seen considerable advances in 
modern language theory [1-4] and in our understand- 
ing of language-related processes (for recent reviews on 
the neurobiology of syntax, see: Bickerton & Szathmáry 
[5]). Concurrently, work in non-human animals has 
seen the development of theoretical frameworks on 
the evolutionary origins of language-related processes 
[1,6-9]. This has led to an increase in comparative 
animal studies on 'artificial grammar learning' (AGL) 
[10-12]. As we consider below, AGL paradigms aim 
to tap into the computational abilities that humans 
use to learn syntactically structured sequences 
[9,13,14]. Moreover, songbirds have recently become 
an important neurobiological model system, in part, 
because they learn their vocalizations and because 
their song production seems to reveal 'syntactic-like' 
abilities that are in some ways related to how humans 
learn to produce language with syntactic structure 
[6,15]. These are all exciting developments, but, arguably, 
one area that remains relatively underdeveloped is 
in advancing mammalian model systems that can pro- 
vide insights on the cellular mechanisms that might be 
homologous to those that the human brain uses to sup- 
port language-related processes. In particular, 
additional comparative work with non-human primates, 
although faced with considerable challenges as we con- 
sider in this paper, is needed to inform us on the 
evolutionary changes that are likely to have occurred 
within the primate order as language evolved in 
humans [8]. Interdisciplinary efforts will remain important 
for advancing future treatments for communication 
and language disorders, and it is likely that major 
advances will be difficult to achieve if research efforts 
are limited to the study of select animal species or to 
the non-invasive approaches that are normally available 
for studying humans. 1 

In this paper, we focus on the conceptual and technical 
challenges that are faced in pursuing evolutionary 
homologues to human syntactic-learning in mammals 
such as non-human primates. We provide a description 
of what we define here as 'proto-syntactic' processes and 
how we might study these behaviourally and 
neurobiologically in ways that can facilitate 
comparative testing with humans and other animals. 
We conclude by reviewing recent perspectives on the 
structure and function of the human brain network for 
syntactic processes, and propose several neurobiological 
hypotheses that consider the possible combinations of 
behavioural sequencing capabilities and neurobiological 
substrates with which different non-human primate 
species might present. 

2. A CONCEPTUAL FRAMEWORK FOR THE 
PURSUIT OF PROTO-SYNTACTIC CAPABILITIES 
AND PROCESSES IN NON-HUMAN ANIMALS 

Syntax can be defined as the ability to learn and to 
produce grammatical relations between words and 
word parts in a sentence. However, syntax is not 
simply the linear sequencing of words (i.e. evaluating 
the word-by-word relationships between elements in 
a string). Although we speak and write word- 
by-word, modern linguistic theory emphasizes that 
beneath the surface-level of word sequences is an 
underlying structure, such as hierarchically nested 
phrases and 'movement' (perceived or actual) of syn- 
tactic constituents [1,2,5,6,17]. In this section, we 
consider: (i) two examples of operational definitions 
of syntactic abilities that could be comparatively 
studied with non-human animals; (ii) the important 
distinction between production and learning, the 
latter of which allows us to ask questions about the 
learning abilities of animals, which might be better 
than their vocal production capabilities; and (iii) the 
idea of an evolutionary gradient in syntactic complexity 
to help us to understand how human syntactic abilities 
may have evolved from simpler systems. In this regard, 
we define proto-syntactic abilities as those that reflect an 
evolutionary increase in computational processing 
capabilities, which comparative testing might reveal 
to have formed an evolutionary basis for human syn- 
tactic abilities. 

(a) A place to start: creating operational 
definitions of syntactic abilities for 
comparative testing 

Most definitions of syntax reflect capabilities that are 
uniquely human, such as the ability to learn to pro- 
duce and evaluate considerable levels of complexity 
in the hierarchical structure of sentences. Since 
no other animals have syntax, grammar, words, sen- 
tences, semantics, etc. as they are defined for human 
language, the first major hurdle for comparative 
study is to be as clear as possible about the operational 
definition of the core aspects of some of these abilities 
that one hopes to study with other animals. Each 
operational definition will suggest and constrain the 
ways in which these abilities can be comparatively 
studied and with which species these could be 
realistically explored. 

As one example, we might be interested in studying 
a general aspect of syntactic sequencing ability, oper- 
ationally defined as follows: An aspect of syntactic 
structure building is present in animals that can 
learn to produce structural relationships between their individual 
vocalizations (what we might call 'syntactic-like' 
ability). The italicized phrase, however, suggests that 
we would need to study species of animals that are 
vocal learners and have communication systems that 
allow them to combine several of their vocalizations 
in some sort of a sequence for production. Syntac- 
tic-like abilities in non-human animals seem to be 
closely associated with vocal imitation and vocal 
learning, such as when songbirds and humpback 
whales learn to structure their songs. The few 
animal species known to be vocal learners (humans, 
songbirds, parrots, hummingbirds, bats, elephants, 
pinnipeds and cetaceans, [8,15,18-22]) have varying 
degrees of syntactic-like capabilities. Of these groups 
of animals, not all are being neurobiologically studied. 
Thus, some groups of songbirds have become repre- 
sentative neurobiological animal model systems for 
vocal production learning and syntactic-like abilities. 
Moreover, although different songbirds show varying 
levels of song complexity, the structure of their 
songs is typically described as exhibiting 'phonological 
syntax' [5,23], where different sequencing 
combinations of the units do not produce different 
meanings (in contrast to the 'semantically compositional 
syntax' of humans). 

Many mammals, non-human primates included, 
have a call-based system for vocal communication 
that lacks the sequencing abilities of songbirds or ceta- 
ceans. Most non-human primates are generally 
thought to produce unitary calls from a limited set of 
innate or genetically regulated vocalizations, although 
this perspective is changing somewhat. Recently, 
Snowdon [24] reviewed the evidence for vocal learning 
in non-human primates, stating: 'None of these 
new results suggest that primates will soon challenge 
songbirds for vocal virtuosity, but nonetheless the 
accumulation of results suggests a much greater 
degree of vocal control and flexibility of production 
than previously thought' (also see [8,25]). Moreover, 
some species of guenons (Old World monkeys) 
appear to combine their calls into different context- 
specific call combinations [26,27]. However, as with 
songbirds, these call combinations lack semantic 
compositionality [1,2]. 

Therefore, an operational definition, such as the 
following, is required to help us to remain empirically 
grounded regarding the limited vocal production 
learning and sequencing capabilities of non-human 
primates: A core aspect of the human syntactic 
capacity, to learn how sensory elements are appropri- 
ately sequenced, might exist in mammals that are 
able to evaluate whether sequences of auditory or 
visual elements violate a previously learned structure. 
This operational definition differs from the one for 
vocal learners above in two key respects. First, it 
does not depend on the vocal production capabilities 
of the animals, which theoretical papers on language 
evolution have suggested is not necessary [7]. 
Second, it draws a distinction between learning and 
production, suggesting that some animals might be 
able to learn sequences of sensory elements better 
than they are able to (re)produce them. We evaluate 
the basis for this claim next. 
(b) The distinction between vocal production 
learning and auditory learning 

It is well known that human receptive capabilities can 
outstrip productive capabilities. Any learner of a 
second language will be familiar with the feeling that 
their ability to understand that language exceeds 
their ability to produce well-formed sentences in it, 
and we know that infants are sensitive to certain prop- 
erties of their native language before they can use them 
[28]. A related distinction is made between 'auditory 
learning' and 'vocal production' by comparative scientists 
because many vertebrates are capable of some 
form of auditory learning although very few species 
are also production vocal learners [19]. Linguists 
tend to focus on receptive abilities when they 
evaluate human language — particularly the ability to 
differentiate between well-formed and ill-formed 
(ungrammatical) sentences. However, when scientists 
look for correspondence to abilities in other animals 
there is a strong tendency to focus on production 
[1], such as syntactic-like abilities in songbirds 
[6,15]. Although many vertebrates are often con- 
sidered to be vocal non-learners, many of these 
animals are capable of considerable auditory learning 
[25,29]. Thus, the extent to which different 
animal species can learn varying levels of complexity 
in how sensory elements are temporally sequenced 
remains an open question and is an issue that remains 
linguistically relevant. 

(c) The notion of an evolutionary gradient of 
syntactic complexity 

The formal language hierarchy (FLH; or extended 
Chomsky hierarchy [4]) contains several categories of 
grammar, each describing an increasingly powerful 
computational language (see figure 1a, which is 
based on Berwick et al. [6]). Here, lower ranked grammars 
(e.g. finite-state grammars (FSGs); also referred 
to as 'sub-regular' grammars [4]) generate sets of 
languages that are subsets of the sets of languages 
generated by higher ranked grammars. Humans seem 
to be unique in the animal kingdom in being able to 
produce languages that breach into the realm of context-sensitive 
languages [30] (figure 1a). However, as 
Hurford notes: '... linguists pay little attention to 
classes of languages of [the] lowly rank on the 
Formal Language Hierarchy' [1]. In our view, this 
has resulted in a lack of resolution of the level of com- 
plexity of FSGs that are not human unique, leading to 
an emphasis on determining whether the status of 
some non-human animal species can be elevated if 
they are able to learn context-free patterns from 
context-free grammars (CFGs; also referred to as 
'supra-regular' grammars [3,4,31]). Moreover, the 
interpretation that songbirds can learn CFG [12,32] 
has been questioned for several reasons considered in 
detail elsewhere [6,33,34], leading Berwick et al. [6] 
to conclude that: 'Considerable controversy remains 
as to whether any nonhuman species can truly recog- 
nize strictly context-free patterns'. Context-free 
pattern learning may someday be demonstrated in cer- 
tain animals [4], yet, even if it is not, it remains 
important to understand how human capabilities 
with CFGs and beyond may have evolved from 



abilities lower in the FLH that are present in other 
living animals. This requires better resolution of the 
lower parts of the hierarchy (figure 1) and consider- 
ation of the distinction between learning — as a 
behavioural measure of reception — and production. 
As we schematize in figure 1b for humans and other 
species of animals, these two behavioural phenotypes 
should be distinguished (see [25]). 
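
The gap between finite-state and context-free patterns can be made concrete with a minimal sketch in Python. It is an illustration only and is not taken from the cited studies. It assumes two element classes, 'A' and 'B', and contrasts the finite-state pattern (AB)^n with the context-free pattern A^nB^n; the function names are hypothetical.

```python
import re

# Finite-state pattern (AB)^n, n >= 1: a regular expression suffices,
# because only the adjacent A->B and B->A transitions need checking.
AB_N = re.compile(r"(AB)+")


def is_ab_n(seq: str) -> bool:
    """Return True if seq is one or more repetitions of 'AB'."""
    return AB_N.fullmatch(seq) is not None


def is_an_bn(seq: str) -> bool:
    """Return True if seq is n 'A's followed by exactly n 'B's (n >= 1).

    Unlike (AB)^n, this needs a count that can grow without bound,
    which is what places A^nB^n outside the finite-state grammars.
    """
    n = len(seq) // 2
    return n >= 1 and seq == "A" * n + "B" * n


if __name__ == "__main__":
    for s in ["AB", "ABAB", "AABB", "AAB"]:
        print(s, is_ab_n(s), is_an_bn(s))
```

Both functions judge whether a sequence is well formed rather than generating one, which matches the emphasis here on learning as a measure of reception rather than production.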

How might the ability to generate context-free 
languages or beyond have evolved? One possibility is 
that when the ancestors to living humans began to 
organize vocalizations and then words into sentences 
of greater complexity, this built upon the evolutionarily 
conserved ability to process sets of serially 
ordered strings. Then at some point selective press- 
ures to reduce memory demands may have 
expanded syntactic capabilities by the adoption of 
rule-based learning strategies that avoid having to 
memorize all the elements and transitions in the 
sequences from more complex grammars [35]. In 
this regard, we are motivated by Hurford's attempts 
to resolve several of the stages below CFGs in 
reference to the various levels of complexity seen in 
the songs of songbirds and humpback whales [1]. 
We will expand on some of his ideas to illustrate 
our proposed notion of an evolutionary gradient of 
syntactic complexity. See Jäger & Rogers [4] for other 
approaches to resolve the sub-regular grammar space 
in the extended Chomsky hierarchy. 

One of the simplest scenarios is for a system to 
recognize and/or to generate single elements. Such is 
the case for animals with call-based systems that can 
produce and recognize single vocalizations from a limited 
set of vocalizations (figure 1c). The next level of 
sequencing complexity is introduced when two calls 
are combined where it then becomes important to 
evaluate the 'adjacent relationships' between element 
pairs. The subsequent level of complexity occurs 
when several elements are serially sequenced in a 
purely linear fashion. An example of this is the linear 
song components of zebra finch songs 
[36], where the pairwise transitions can be modelled 
by a first-order Markov process [1] (figure 1c). 
Adding more elements or transitions does not 
change the computational complexity of the pairwise 
sequencing process, but requires a larger indexical 
memory store. Some songbirds, such as Bengalese 
finches, nightingales and chaffinches, and humpback 
whales have songs that show sequencing elaborations 
such as forward or backward branching relationships 
and elaborations such as repeating elements within a 
range of acceptable repetitions. While it is not always 
clear which of these would be hierarchically higher 
than the others in terms of syntactic complexity 
(however defined), these sorts of transitions deviate 
from strictly linear processes [37], although these 
cases still only require first-order Markov processes 
to model them (figure 1c). Of special interest are the 
branching transitions since these can be modelled 
either as a number of adjacent relationships, or could 
include more complex 'non-adjacent relationships' 
where an optional element can occur between two 
other elements with some probability. The recognition 
of non-adjacent relationships can reduce the need to 
Figure 1. Formal language hierarchy (FLH) distinguishing learning versus production and the notion of a gradient of syntactic 
complexity. (a) Schematic of FLH, based on Berwick et al. [6]. (b) Our illustrated distinction between learning and production 
in relation to the FLH, highlighting considerable uncertainty in how high human and other animal learning (rather than production) 
capabilities can reach in the hierarchy; see text. (c) Schematic for quantifying the different dimensions of syntactic 
complexity: on the vertical axis, a measure of linearity can be used as a function of increasing memory demands on the horizontal 
axis. Other ways of quantifying complexity in different dimensions would also be useful to test. At the lowest level are 
single element/state systems, followed by multi-state linear systems with (i) only 'adjacent relationships', (ii) forward branching 
systems including 'non-adjacent relationships' that some animals might be able to learn, and (iii) state repetitions that might 
tap into numerosity sensitivity. Higher still are 'state chains' that cannot be solved by first-order Markov models; here the transition 
following each 'a' element depends on the preceding state transition. This process is second-order Markov, but all other 
transitions are first-order Markov processes [1]. 



memorize many pairwise transitions if the non-adjacency 'rule' can be learned. For adult humans, 
non-adjacent relationships can include even greater levels of complexity (e.g. nested or crossed relationships [38,39]). 
Moreover, the ability to deal with non-adjacent relationships is not present at birth but 



Phil Trans. R. Soc. B (2012) 



Review. Primate proto-syntactic hypotheses C. I. Petkov and B. Wilson 2081 



seems to occur during infant development [40,41]. As a 
final example of another level of syntactic complexity 
(figure 1c), Hurford notes the special case of the same 
element occurring in multiple parts of the sequence 
where its next transition state depends on the preceding transition, 
called a 'state chain' process [1]. Such transitions 
require higher order Markov models, although much 
of the rest of the sequence could remain a first-order 
Markov process. 
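
The difference between first-order and higher-order Markov descriptions can be sketched as follows. This is a hypothetical illustration in Python: the toy 'songs' and the helper names are assumptions, not data from any species. For each context of a given length, the code collects the set of elements observed to follow it. A sequence is treated as deterministic at that order if every context has exactly one possible successor.

```python
from collections import defaultdict


def successors(seq, order):
    """Map each context (tuple of `order` preceding elements) to the
    set of elements observed to follow it in `seq`."""
    table = defaultdict(set)
    for i in range(order, len(seq)):
        table[tuple(seq[i - order:i])].add(seq[i])
    return dict(table)


def is_deterministic(seq, order):
    """True if every context of length `order` has a single successor."""
    return all(len(nxt) == 1 for nxt in successors(seq, order).values())


# Toy linear song: each element predicts the next (first-order Markov).
linear_song = list("abcdabcdabcd")

# Toy 'state chain': element 'a' recurs, and what follows it depends on
# the element that preceded it (x a b ... y a c ...).
state_chain = list("xabyacxabyac")

if __name__ == "__main__":
    print(is_deterministic(linear_song, 1))  # pairwise transitions suffice
    print(is_deterministic(state_chain, 1))  # successor of 'a' is ambiguous
    print(is_deterministic(state_chain, 2))  # one more element of context
```

In the toy state chain only the transitions out of 'a' need the longer context; every other transition is already determined by the preceding element alone, in line with the point that much of the sequence can remain first-order.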

We hope that these examples help to illustrate the 
great variety seen in animal song production that can 
be usefully applied towards quantifying the structural 
complexity between different artificial grammars, 
prior to using these in comparative tests with different 
animal species. It would be of benefit to many if the 
scientific community works together to rank the com- 
plexity of these structures along different dimensions 
(using quantitative rather than qualitative descriptions, 
wherever possible). Subsequently, the learning abilities 
of animals can be evaluated along the various 
dimensions of 'syntactic complexity' to advance our 
understanding of the evolutionary bases for human 
syntactic abilities. It remains possible that the 
evolution of syntactic complexity may have been 
step-wise rather than, as we have proposed, a gradient 
function. Yet, if the pursuit is informative regarding 
how language may have evolved, we welcome the test- 
ing of different alternative hypotheses. As we will 
discuss in §4, there is already a basis for considering 
syntactic complexity from the human cognitive neuro- 
science literature, where, for instance, the comparison 
of adjacent versus non-adjacent relationships (broadly 
defined) seems to be able to predict which parts of 
the human language network are engaged [42]. 

3. OBTAINING COMPARATIVE DATA ON 
ARTIFICIAL GRAMMAR LEARNING: IMPLICIT 
VERSUS EXPLICIT LEARNING 

Classically, the behavioural approach has been a tool 
of choice for comparative biologists and psychologists. 
However, even behavioural testing is challenging to 
apply in the same way across species that may have 
different forms of communication, different levels of 
motivation, varying abilities to engage in behavioural 
testing and that may find different methods of provid- 
ing responses more natural than others. Combining 
behavioural study with neurobiological measurements 
that can be performed in a similar way across the 
species escalates the obstacles to success. Yet, bridging 
techniques and approaches are required to link 
research based on the study of different species. In 
this section, we consider (i) how AGL can be used 
to study implicit or explicit learning processes and 
(ii) several approaches in which behavioural and neu- 
robiological data can be similarly obtained across 
species to facilitate comparative testing. 

The use of AGL paradigms is a promising approach 
for understanding what aspects of syntactic-related pat- 
terns can be learned by animals. Following Chomsky's 
theoretical formulations of the structure of language 
[17], Reber pioneered the use of artificial language para- 
digms to study how humans learn language structure 
[10]. AGL paradigms have been used to explore the 



types of structures that humans (including infants and 
adults), songbirds, non-human primates and rodents 
can learn [33,43-47]. However, there are differences 
in how some of these study groups have been tes- 
ted such that different learning substrates might have 
been engaged. 

The infant and non-human primate data have 
tended to be obtained relying on the implicit learning 
of artificial grammars, which is often studied by 
measuring preferential looking during habituation/ 
dishabituation paradigms [11,14,44,48]. Typically, 
these experiments are conducted by familiarizing the 
individual for some length of time with exemplary 
sequences of stimuli that follow the artificial-grammar 
pattern or rule(s) [11,12,14,33,44,45,48]. Then in 
the second 'testing phase' of the experiment, the 
individual is tested with well-formed 'correct' or 
'violation' sequences during natural response 
measurements, such as preferential looking towards 
the audio speaker that presented the test sequence. 
In this way, the familiarization and testing need not 
engage perceptual awareness for learning to have 
occurred, i.e. implicit learning [11,44,45,48]. However, 
in the bird and rodent studies, the participants were 
trained to discriminate correct versus violation 
sequences, which could engage an explicit rather than 
implicit learning system [12,33,46,49]. Similarly, in 
many of the human studies [43,50,51] either during 
the familiarization phase or during the testing phase, 
the participants were engaged in learning the sequen- 
cing structure of the artificial grammar by being asked 
to judge whether the sequences were correct or violation 
sequences. When participants are actively seeking to 
determine the artificial-grammar pattern, there is a 
risk that they might fail to learn some of the sequencing 
relationships after the point at which they feel that they 
have sufficiently understood the pattern and are per- 
forming reasonably well. Such explicit learning could 
engage different brain circuits [52] in relation to studies 
of AGL using implicit learning (such as those in infants 
and non-human primates). 
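
The familiarization and testing phases described above can be sketched in Python. This is a minimal illustration under stated assumptions: the forward-branching grammar, its element labels and all function names are hypothetical and are not the stimuli of any cited study. 'Correct' sequences are generated by a random walk through the grammar, and 'violation' sequences are made by swapping one adjacent pair of elements so that at least one transition becomes illegal.

```python
import random

# Hypothetical forward-branching finite-state grammar (an assumption for
# illustration). Each key maps to its legal successors; "END" terminates.
GRAMMAR = {
    "START": ["A"],
    "A": ["B", "C"],  # forward branch: B is optional between A and C
    "B": ["C"],
    "C": ["D"],
    "D": ["END"],
}


def generate(rng):
    """Random walk through GRAMMAR, returning one 'correct' sequence."""
    seq, state = [], "START"
    while True:
        state = rng.choice(GRAMMAR[state])
        if state == "END":
            return seq
        seq.append(state)


def is_correct(seq):
    """True if every transition in seq, including start and end, is legal."""
    path = ["START", *seq, "END"]
    return all(b in GRAMMAR.get(a, []) for a, b in zip(path, path[1:]))


def make_violation(seq, rng):
    """Swap one adjacent pair; retry until the result breaks the grammar."""
    while True:
        i = rng.randrange(len(seq) - 1)
        bad = seq[:i] + [seq[i + 1], seq[i]] + seq[i + 2:]
        if not is_correct(bad):
            return bad


if __name__ == "__main__":
    rng = random.Random(0)
    familiarization = [generate(rng) for _ in range(5)]
    test_pairs = [(s, make_violation(s, rng)) for s in familiarization]
    for correct, violation in test_pairs:
        print(correct, violation)
```

Each violation keeps the same elements as its correct counterpart, so the pair differs only in sequencing structure. This is one possible design choice, and other ways of constructing violations would probe different relationships.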

More recently, the groups of Hagoort and Petersson 
have worked to engage adult humans in more implicit 
learning paradigms, whereby little instruction is given 
to participants during testing other than to report by 
pressing one of two buttons their preference for a test 
sequence (i.e. whether they 'liked' the sequence or 
not). Subsequent to this, the participants were asked 
to make 'grammaticality' judgements both to validate 
the preference judgements and to engage explicit learning 
[38,39]. Interestingly, implicit and explicit 
AGL are reported as yielding fairly comparable results. 
Both seem to engage the inferior-frontal gyrus (IFG), 
e.g. Broca's territory (Brodmann areas (BA) 44/45), as 
has been reported in several other human AGL or natu- 
ral language-learning studies. 

When comparing data with animals such as non- 
human primates that are limited vocal learners, an 
advantage of using implicit rather than explicit learning 
of artificial-grammar sequences is to avoid engaging the 
aspects of the network that in vocal learners such as 
humans and songbirds might form part of the network 
engaged in vocal production. Implicit learning might 
be better able to distinguish perception from motor 



Figure 2. Eye-tracking measurement of implicit artificial grammar learning. (a) Schematic of a behavioural eye-tracking experiment 
with monkeys in our laboratory. The monkey sits in front of a monitor and after a brief central fixation period, an 
auditory test sequence is randomly presented from the left or right audio speaker. The length of time spent looking into 
the predefined analysis region around the presenting audio speaker is measured. (b) Exemplary eye-traces towards correct 
('grammatical') and violation ('ungrammatical') sequences. Positive values in the plot indicate eye movements towards the 
test speaker location, whichever audio speaker it was; negative values are looks in the opposite direction away from the presenting 
audio speaker. (c-f) Exemplary non-invasive infra-red eye tracking of (c) adult human, (d) human infant (image courtesy of 
J. Read), (e) macaque and (f) marmoset. 



production in the service of perception by reducing the 
ability of vocal learners to rely on sub-articulation, imi- 
tation, etc. to assist in the perception of syntactic 
sequences. Otherwise, several aspects of the networks 
that support syntactic or syntactic-like learning in 
vocal learners would by comparison to vocal non- 
learners appear to be strikingly different (e.g. human 
or songbird unique). For a more detailed discussion of 
the similarities and differences in the behaviour and 
neurobiology of vocal learners (such as songbirds 
and humans) and other animals with more limited 
vocal learning abilities (such as non-human primates 
and other birds), see Petkov & Jarvis [25]. 

Another way in which comparative testing can 
be facilitated is to use similar behavioural and neuro- 
biological measurements between humans, infants 
and non-human animals. For instance, for behaviou- 
ral testing, infra-red eye tracking has become more 
available in scientific laboratories and can be used 
to evaluate preferential looking responses after 



habituation to artificial-grammar sequences. This is 
shown for macaques in figure 2a,b and can be 
comparably conducted in adult humans, infants 
and other types of monkeys, such as marmosets 
(figure 2c-f). Apart from the advantage of using eye 
tracking to measure implicit learning similarly across 
participant groups, the approach also offers a more 
objective way to analyse behavioural data, in relation 
to the traditional approach of manually rating the 
animals' responses as captured on video, which 
has been criticized [34]. Other groups have opted to 
use brain potentials both to obtain neurobiological 
data after AGL and to evaluate whether, for 
instance, infant brain potentials show a signature of 
learning [40]. 
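
How a looking-time measure might be computed from eye-tracking samples can be sketched as follows. This is a minimal Python illustration under stated assumptions and does not describe the analysis used in any cited study: the sampling rate, the analysis-region bounds, the units of gaze position and the function names are all hypothetical.

```python
def looking_time(gaze_x, sample_rate_hz, region):
    """Total time (s) that horizontal gaze samples fall inside `region`.

    gaze_x: iterable of horizontal gaze positions (arbitrary units);
            None marks a lost sample (e.g. a blink) and is not counted.
    region: (low, high) bounds of the analysis region around the
            presenting audio speaker.
    """
    low, high = region
    inside = sum(1 for x in gaze_x if x is not None and low <= x <= high)
    return inside / sample_rate_hz


def preference_index(correct_s, violation_s):
    """Difference in mean looking time (violation minus correct), in s.

    The sign convention is an assumption of this sketch; it makes no
    claim about which sequence type attracts longer looks.
    """
    mean_correct = sum(correct_s) / len(correct_s)
    mean_violation = sum(violation_s) / len(violation_s)
    return mean_violation - mean_correct


if __name__ == "__main__":
    # Made-up trace sampled at 10 Hz, analysis region from 10 to 20 units.
    trace = [0.0, 4.0, 12.0, 15.0, None, 11.0, 30.0]
    print(looking_time(trace, 10, (10.0, 20.0)))
```

Because the same calculation applies to any gaze trace, a measure of this kind could be obtained in the same way for adult humans, infants and monkeys.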

Many neuroscientific studies are conducted in 
anaesthetized animals. However, comparative AGL 
studies using evoked potentials or brain neuroimaging 
will depend on the animals being studied awake, rather 
than anaesthetized. Technical advances have made it 
possible to accommodate non-human animals so that 
they can be scanned awake with functional magnetic- 
resonance imaging (fMRI), which is often used to 
scan humans [53,54]. Moreover, although the gradient 
systems of MRI scanners generate a considerable 
amount of noise, animal MRI studies often use strat- 
egies to reduce the impact of scanner noise on the 
animals and to improve the auditory activity response 
during sound stimulation [55,56]. Recent fMRI and 
positron emission tomography (PET) studies have 
described how the brains of non-human primates 
process communication signals (monkeys [57-60]; 
chimpanzees [61]). General summaries are now avail- 
able on how the results in monkeys and apes relate to 
how the human brain processes species-specific com- 
munication signals [62,63]. In this way, testing for the 
level of correspondence across the species, rather than 
assuming that it exists, provides a stronger bridge 
between human neuroimaging work and studies in cer- 
tain species of non-human primates, where for instance 
the processing of communication signals can be studied 
at the neuronal level [64-66]. As a specific example, 
an fMRI-based correspondence has recently been 
suggested between how human [67,68] and monkey 
[57] brains process voice content in communica- 
tion sounds; see Petkov et al. [62]. Subsequently, 
fMRI-guided electrophysiology was used in the mon- 
keys to target fMRI-identified voice-sensitive brain 
clusters, which when studied seemed to reveal 'voice 
cells' in the primate brain [69]. A similar two-stage 
approach — linking human neuroimaging results on 
language-related processes using a bridging technique 
followed by the neuronal-level study of potential homol- 
ogues in an animal model system — could provide novel 
insights into the cellular function of evolutionarily conserved 
regions that in humans evolved to support 
language-related processes. 

4. NEUROBIOLOGICAL HYPOTHESES ON THE 
PROTO-SYNTACTIC LEARNING NETWORK 
IN MONKEYS 

There is a growing consensus among scientists that the 
prominent brain regions in humans that are engaged 
in syntactic processes involve the left inferior and 
middle frontal cortex, large parts of the superior and 
middle temporal cortex, parts of the parietal cortex 
and subcortical regions such as the basal ganglia, as 
well as a number of these same regions in the right hemi- 
sphere [42,70]. Many of these brain regions appear to 
be engaged both during syntactic processing of natural 
language [71,72] and when human participants evalu- 
ate artificial-grammar sequences [13,38,43,50]. Thus, 
a considerable amount of language-related processing 
does not appear to be strictly language-specific. Friederici 
[42] has recently proposed an extensive model integrat- 
ing information on the structure, function and 
connectivity of the human brain network that subserves 
language processing. Important to this model is how 
different behavioural demands can engage different 
aspects of the language network [42]; thus, we next 
give an overview of some of the key concepts that are relevant 
for neurobiological hypotheses of proto-syntactic net- 
works in non-human primates. For other models, 



including those that focus on human speech processing 
and the relevance to brain pathways for auditory pro- 
cessing in primates, see [73,74]. 

— Several language pathways. Human semantic and syn- 
tactic processing engages several brain pathways: two 
dorsal pathways link posterior temporal and parietal 
lobe regions with either premotor cortex BA 6 
(dorsal pathway I; via the superior longitudinal fasci- 
culus (SLF)) or BA 44 in Broca's territory (dorsal 
pathway II; via the arcuate fasciculus, a part of the 
SLF). Two ventral pathways are hypothesized to 
link anterior supra-temporal lobe regions and either 
BA 45 in Broca's territory (ventral pathway I; via 
the extreme capsule (EC) fibre system) or the frontal 
operculum (FOP) area below BA 44/45 (ventral 
pathway II; via the uncinate fasciculus (UF)).

— Syntactic complexity demands on the network. For initial 
syntactic structural analysis, the FOP and ventral 
pathway II are engaged (including for finite-state 
grammars such as (AB)^n that monkeys and songbirds
appear able to learn [11,12,50]). Dorsal pathway II 
(arcuate fasciculus) and BA 44 are critical for syntac- 
tic function, such as evaluating hierarchical structure 
and 'non-adjacent relationships' of various types
[13,43]. Dorsal pathway II (to BA 44) and ventral 
pathway I (to BA 45) are engaged in semantic and 
syntactic relationships or syntactic movement (e.g. 
evaluating whether a sentence structure is
subject-verb-object versus object-subject-verb). Higher
memory demands and longer-distance non-adjacent
relationships engage Broca's territory (BA 44 in par- 
ticular) and dorsal pathway I to premotor cortex. 
However, dorsal pathway I is primarily involved in 
sensory-to-motor mapping. 

— Left hemisphere dominant and subcortical structures 
can be engaged. The syntactic/semantic network in 
frontal cortex tends to be left lateralized (see also
[72,75]). The right hemisphere is thought to
mainly subserve functions such as the prosodic 
and emotional aspects associated with linguistic 
comprehension. Subcortical structures such as 
the hippocampus and basal ganglia can be differ- 
ently engaged relative to, e.g. BA 44, at different 
stages of syntactic learning [76]. 
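
The distinction drawn above between adjacent and non-adjacent relationships can be made concrete with a small sketch. The Python below is an illustration only and does not reproduce the stimuli of any cited study: the element labels, the function names and the three-element first-to-last rule are assumptions chosen for clarity.

```python
import re

def ab_n(n):
    """Return the (AB)^n sequence as a list, e.g. n=2 -> ['A', 'B', 'A', 'B']."""
    return ["A", "B"] * n

def is_ab_n(seq):
    """True if seq is (AB)^n for some n >= 1.

    A regular expression is enough because the grammar is finite-state:
    only adjacent transitions (A->B, B->A) need to be tracked.
    """
    return re.fullmatch(r"(AB)+", "".join(seq)) is not None

# Hypothetical non-adjacent rule (an assumption for illustration):
# the first element determines the last, whatever element intervenes.
NON_ADJACENT_PAIRS = {"A1": "B1", "A2": "B2"}

def obeys_non_adjacent(seq):
    """True if a three-element sequence respects the first-to-last dependency."""
    return len(seq) == 3 and NON_ADJACENT_PAIRS.get(seq[0]) == seq[2]

print(is_ab_n(ab_n(3)))                       # legal (AB)^3 sequence
print(is_ab_n(list("ABBA")))                  # adjacent violation: B followed by B
print(obeys_non_adjacent(["A1", "X", "B1"]))  # dependency respected
print(obeys_non_adjacent(["A1", "X", "B2"]))  # non-adjacent violation
```

In the first checker a violation can be detected from neighbouring elements alone, whereas in the second the first element has to be held in memory across the intervening one. That memory requirement is the sense in which non-adjacent relationships are treated as the more demanding case in the text.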

Based on these considerations, several hypotheses can 
be articulated that consider the level of complexity that 
non-human primates are capable of learning and the 
neurobiological regions and pathways that might be 
engaged. For clarity in illustration, in figure 3, we sub- 
divide the likely AGL capabilities into abilities for 
evaluating adjacent relationships alone or with non- 
adjacent relationships. See §2 and figure 1 for other 
aspects of syntactic complexity that could also be 
useful for testing. Moreover, since we are considering 
AGL of the temporal structure of sensory elements, 
it is an open question whether, all things being equal, all
presumed homologues of the pathways that have 
been described in humans would be engaged (e.g. 
dorsal pathway I to premotor cortex that is engaged 
in sensory-to-motor mapping might not be involved 
in this case). Also, although traditionally the dorsal
arcuate fasciculus is considered as the classical
language pathway linking Broca's and Wernicke's terri-
tories, the ventral pathway(s) and their role in
language processes are being emphasized by some
groups [77-79]. However, although the ventral UF
and EC pathways are anatomically evident in non-
human primates, our hypotheses at this point only
make predictions about the EC pathway since it is
the primary ventral fronto-temporal tract that can cur-
rently be resolved with in vivo connectivity studies of
the IFG in monkeys and apes [80,81].

[Figure 3 panel titles: (a) hypothesis 1: ventral proto-syntactic pathway; (b) hypothesis 2: dorsal proto-syntactic pathway; (c) hypothesis 3: multiple pathways as a function of proto-syntactic complexity; (d) other hypothetical variations: engage cortico-striatal-thalamic system, engage more bilateral system. Labels: central sulcus; right hemisphere; macaque brain. Key: adjacent relationships (FSG); non-adjacent relationships (FSG).]

Figure 3. Hypothetical proto-syntactic learning capabilities and neurobiological substrates in monkeys. (a) Hypothesis 1 illus-
trates a ventral pathway linking the supratemporal plane with inferior frontal cortex. Here, the animals are only able to learn
adjacent relationships in finite-state grammars (FSGs). (b) Hypothesis 2 illustrates a dorsal pathway supporting the learning of
FSG. (c) Hypothesis 3 illustrates the reliance on multiple pathways and regions depending on the complexity of the FSG pat-
terns that can be learned (e.g. for adjacent relationships, a ventral pathway; for non-adjacent relationships, a dorsal pathway
and/or a different part of the ventral pathway). (d) For a discussion of variations to these hypotheses, see text. AC, auditory cortex;
aSt, anterior striatum; Gp, globus pallidus; vF4/vF5, ventral frontal cortical areas F4 and F5 [82]; VL, ventro-lateral thalamus;
44/45, Brodmann areas 44/45.

(a) Hypothesis 1: ventral pathway for 
proto-syntactic learning 

There is evidence that tamarin monkeys are able to 
learn adjacent relationships in FSGs, but are insensi- 
tive to violations of more complex grammatical 
patterns [11]. Also, human fMRI results suggest that 
the processing of such adjacent relationships engages 
the FOP, more so than Broca's territory [50]. How- 
ever, in humans the processing of various sorts of 
non-adjacent relationships in artificial grammars
[13,50], including those with hierarchical structure 
[43], engages at least Broca's territory, e.g. BA 44. 
Thus, one hypothesis is that the involvement of
Broca's territory (and the dorsal SLF pathway)
would not be seen in non-human primates [9,50], 
especially if the animals are not capable of evaluating 
non-adjacent relationships. In this scenario, when evaluat- 
ing adjacent relationships or simpler syntactic-related 
relationships, both humans and some species of non- 
human primates might engage a ventral pathway (EC 
and/or UF) interconnecting anterior temporal lobe 
regions and frontal cortical areas that are inferior to 
BA 44/45 (e.g. in monkeys, the frontal opercular areas 
or areas vF5/F4 [82]). We illustrate this scenario in 
figure 3a (hypothesis 1: ventral pathway). If the non-
human primates are able to evaluate non-adjacent
relationships and for this engage the ventral pathway, 
then this would suggest that the human dorsal pathway 
involving the arcuate fasciculus differentiated during 
language evolution to support increasing syntactic 
complexity, as Rilling et al. [80] have suggested. 

(b) Hypothesis 2: dorsal pathway for 
proto-syntactic learning 

A second hypothesis is that the processing of FSGs 
with only adjacent relationships engages monkey homol- 
ogues of BA 44/45 and the dorsal SLF pathway, as
illustrated in figure 3b (hypothesis 2: dorsal pathway).
This would also suggest that the dorsal pathway 
differentiated after the split from a common 
ancestor to support the learning of greater syntactic 
complexity in humans. 

(c) Hypothesis 3: multiple pathways in 
non-human primates for proto-syntactic 
learning depend on syntactic complexity 

A third hypothesis is that different brain regions and 
pathways are engaged depending on the complexity of 
the grammars that can be learned. For instance, any 
combination of the following might be possible: (i) 
parts of the ventral pathway linking temporal lobe 
regions to monkey homologues of the human FOP are 
engaged in the processing of adjacent relationships in 
FSGs; (ii) the dorsal pathway is relied on for processing 
greater complexity in FSGs, such as non-adjacent 
relationships [13]; and/or (iii) different parts of the ven- 
tral pathway are engaged in evaluating either adjacent or 
non-adjacent relationships (figure 3c: hypothesis 3: 
multiple pathways). The combination of these scenarios 
in monkeys might be viewed as the most comparable
to how the human brain processes syntactic complexity, 
but there could be subtle differences. For instance, 
would the processing of comparable adjacent and non- 
adjacent relationships in artificial grammars engage a 
broader set of regions in frontal cortex in monkeys? If 
so, this could suggest a different form of functional differ- 
entiation during human language evolution from the ones 
considered for the other hypotheses above. For example, 
the ventral pathway and BA 45 might in humans have had 
to differentiate to support the combination of semantic 
and syntactic relationships [42].

(d) Other variants and hypotheses 

The human syntactic learning network is also not 
entirely left lateralized [70], nor is the processing of 
communication sounds in humans, chimpanzees or 
monkeys [62,63]. Thus, it is possible that the right 
hemisphere in non-human primates might show 
some of the homotopic regions and connectivity illus- 
trated here for the left hemisphere. Also, for brevity, 
the hypotheses of figure 3 do not illustrate the possible 
greater or lesser reliance on subcortical structures 
(such as the striatum and basal ganglia) or cerebellum 
to support, for instance, the implicit learning of artifi- 
cial-grammar sequences. Fitch [83] proposed three 
interesting hypotheses regarding how the human syn- 
tactic network might differ from ancestral variants 
present in living non-human animals. First is the 
notion that human vocal learning involves a direct 
pathway between the regions required for vocal learn- 
ing and the laryngeal motoneurons in the nucleus 
ambiguus in the brainstem. As suggested in §3 
above, we would not expect the vocal production path- 
way to be engaged in (at least) the implicit learning of 
artificial-grammar sequences in non-human primates; 
for more details, see Petkov & Jarvis [25]. The 
second Fitch hypothesis regarding the specialization 
of the arcuate fasciculus [80] is considered in detail 
above. The third of the hypotheses considers the archi- 
tectonic and other specializations of Broca's territory,
e.g. BA 44, which, if present, might be evident in
differences in the neurobiological activity and/or con- 
nectivity patterns between humans and monkeys in 
relation to their behavioural capabilities. 

In summary, it is possible that humans engage 
at least Broca's territory and a dorsal pathway to pro- 
cess grammatical complexity in a way that may not 
be evident in non-human primates (hypothesis 1 in 
figure 3). Other possibilities are that monkeys may 
engage homologues of BA 44 and parts of the dorsal 
SLF tract for grammars perceived as simple by 
humans (hypothesis 2), or that there is a general corre- 
spondence between how human and monkey brain 
networks evaluate artificial-grammar complexity 
(hypothesis 3) with more or less subtle differences in 
hemispheric lateralization and/or cortical and sub- 
cortical engagement. It remains to be seen how a 
proto-syntactic network in monkeys would compare to 
the network in humans that subserves syntactic learning.
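
One way to see how the three hypotheses could be told apart empirically is to tabulate their predictions and compare them with an observed pattern of pathway engagement. The sketch below is hypothetical: the labels, the dictionary layout and the function name are assumptions made for illustration, and the entries paraphrase the hypotheses of figure 3 rather than encode any measured result. The distinction between different parts of the ventral pathway (hypothesis 3) is deliberately collapsed into a single 'ventral' label.

```python
# Predicted pathway(s) engaged in monkeys, per hypothesis and relationship type.
# None marks a case where, in this sketch, the hypothesis is treated as making
# no specific prediction (an assumption, not a claim from the text).
PREDICTIONS = {
    "H1_ventral":  {"adjacent": {"ventral"}, "non_adjacent": {"ventral"}},
    "H2_dorsal":   {"adjacent": {"dorsal"},  "non_adjacent": None},
    "H3_multiple": {"adjacent": {"ventral"}, "non_adjacent": {"dorsal", "ventral"}},
}

def consistent_hypotheses(observed):
    """Return the hypotheses compatible with an observed engagement pattern.

    observed maps a relationship type ('adjacent' or 'non_adjacent') to the
    pathway found to be engaged ('ventral' or 'dorsal').
    """
    matches = []
    for name, predicted in PREDICTIONS.items():
        compatible = all(
            predicted[rel] is None or pathway in predicted[rel]
            for rel, pathway in observed.items()
        )
        if compatible:
            matches.append(name)
    return sorted(matches)

# Hypothetical outcomes, for illustration only.
print(consistent_hypotheses({"adjacent": "ventral"}))
print(consistent_hypotheses({"adjacent": "ventral", "non_adjacent": "dorsal"}))
print(consistent_hypotheses({"adjacent": "dorsal"}))
```

Under these assumptions, testing adjacent relationships alone cannot separate hypothesis 1 from hypothesis 3. Only a non-adjacent condition, if the animals can learn it, distinguishes the two.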

5. CONCLUSIONS 

At least conceptually, the approach with non-human pri- 
mates and possibly also the one that might be taken with 
other so-called 'vocal non-learning' animals must differ
from the approaches that are being taken with vocal 
learning animals, such as songbirds. On the other 
hand, the comparative testing of behaviour and neuro- 
biology needs to be done as similarly as possible across 
the species so that data can be compared. We have 
aimed to build on the efforts of the international scien- 
tific community to understand the origins of language 
and to open new pathways for pursuing language hom- 
ologues in non-human animals that tend to be dismissed 
from consideration. Work has also begun to refine the 
comparative behavioural testing of humans and non- 
human animals on AGL paradigms and we have 
begun to obtain initial results on monkey AGL with 
fMRI in our laboratory [84]. The constraints that are 
imposed by working with animals that are limited 
vocal learners can also be positively viewed as providing 
important insights, guidance and predictions into the 
ancestral state of the human language-related network 
and its generic processing capabilities. Thereby, the 
comparative approach remains important for under- 
standing language evolution and for the development 
of useful animal model systems to study the evolutiona- 
rily conserved aspects of the human language-related 
network at the cellular and molecular levels.

ENDNOTE

*In some cases, neuronal studies in humans are possible [16]. How- 
ever, great care is required for interpreting the results from clinical 
patients that either involve or neighbour pathological regions that
are being monitored for neurosurgical resection. In all cases, infor-
mation from cell and molecular studies in animals can enhance the 
data from neuronal-level study in humans, provided that the level 
of correspondence across the species is tested using a common 
bridging technique.